Abstract

We consider a model for gene regulatory networks that is a modification of Kauffman's (1969) random Boolean networks. There are three parameters: $n$ = the number of nodes, $r$ = the number of inputs to each node, and $p$ = the expected fraction of 1's in the Boolean functions at each node. Following a standard practice in the physics literature, we use a threshold contact process on a random graph on $n$ nodes, in which each node has in-degree $r$, to approximate its dynamics. We show that if $r \ge 3$ and $r \cdot 2p(1-p) > 1$, then the threshold contact process persists for a long time, which corresponds to chaotic behavior of the Boolean network. Unfortunately, we are only able to prove the persistence time is $\ge \exp(c n^{b(p)})$ with $b(p) > 0$ when $r \cdot 2p(1-p) > 1$, and $b(p) = 1$ when $(r-1) \cdot 2p(1-p) > 1$.
1 Introduction



Random Boolean networks were originally developed by Kauffman (1969) as an abstraction of genetic regulatory networks. In our version of his model, the state of each node $x \in V_n = \{1, 2, \ldots, n\}$ at time $t = 0, 1, 2, \ldots$ is $\eta_t(x) \in \{0, 1\}$, and each node $x$ receives input from $r$ distinct nodes $y_1(x), \ldots, y_r(x)$, which are chosen randomly from $V_n \setminus \{x\}$.

We construct our random directed graph $G_n$ on the vertex set $V_n = \{1, 2, \ldots, n\}$ by putting oriented edges to each node from its input nodes. To be precise, we define the graph by creating a random mapping $\phi : V_n \times \{1, 2, \ldots, r\} \to V_n$, where $\phi(x, i) = y_i(x)$, such that $y_i(x) \ne x$ for $1 \le i \le r$ and $y_i(x) \ne y_j(x)$ when $i \ne j$, and taking the edge set $E_n = \{(y_i(x), x) : 1 \le i \le r, x \in V_n\}$. So each vertex has in-degree $r$ in our random graph $G_n$. The total number of choices for $\phi$ is $[(n-1)(n-2)\cdots(n-r)]^n$. However, the resulting graph $G_n$ will remain the same under any permutation of the vector $y_x = (y_1(x), \ldots, y_r(x))$ for any $x \in V_n$. So if $e_{z,x} \in \{0, 1\}$ is the number of directed edges from node $z$ to node $x$ in $G_n$, then $\sum_{z=1}^n e_{z,x} = r$, and the total number of permutations of the vectors $y_x$, $1 \le x \le n$, that correspond to the same graph is $(r!)^n$. So if $\mathbb{P}$ denotes the distribution of $G_n$, then

$$\mathbb{P}(e_{z,x}, 1 \le z, x \le n) = \frac{(r!)^n}{[(n-1)(n-2)\cdots(n-r)]^n} = \frac{1}{\binom{n-1}{r}^n},$$

if $e_{z,x} \in \{0, 1\}$, $e_{x,x} = 0$ and $\sum_{z=1}^n e_{z,x} = r$ for all $x \in V_n$, and $\mathbb{P}(e_{z,x}, 1 \le x, z \le n) = 0$ otherwise. So our random graph $G_n$ has uniform distribution over the collection of all directed graphs on the vertex set $V_n$ in which each vertex has in-degree $r$. Once chosen, the network remains fixed through time. The rule for updating node $x$ is

$$\eta_{t+1}(x) = f_x(\eta_t(y_1(x)), \ldots, \eta_t(y_r(x))),$$

where the values $f_x(v)$, $x \in V_n$, $v \in \{0, 1\}^r$, chosen at the beginning and then fixed for all time, are independent and $= 1$ with probability $p$.
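The update rule can be made concrete with a short simulation. The following Python sketch is illustrative only: the names `build_network` and `step`, the 0-based node labels, and the dictionary representation of the truth tables $f_x$ are assumptions made for this example and do not come from the model description itself.

```python
import random
from itertools import product

def build_network(n, r, p, rng):
    """Sample inputs y_1(x..), y_r(x) and truth tables f_x for nodes 0..n-1.

    Hypothetical helper: each node gets r distinct inputs different from
    itself, and every entry f_x(v) is an independent Bernoulli(p) bit.
    """
    inputs, tables = [], []
    for x in range(n):
        others = [y for y in range(n) if y != x]
        inputs.append(rng.sample(others, r))  # r distinct inputs, none equal to x
        tables.append({v: int(rng.random() < p)
                       for v in product((0, 1), repeat=r)})
    return inputs, tables

def step(state, inputs, tables):
    """One synchronous update: eta_{t+1}(x) = f_x(eta_t(y_1(x)), ..., eta_t(y_r(x)))."""
    return [tables[x][tuple(state[y] for y in inputs[x])]
            for x in range(len(state))]

if __name__ == "__main__":
    rng = random.Random(0)
    inputs, tables = build_network(n=8, r=3, p=0.5, rng=rng)
    state = [rng.randint(0, 1) for _ in range(8)]
    for _ in range(5):
        state = step(state, inputs, tables)
    print(state)
```

Since the network and the truth tables are sampled once and then reused in every call to `step`, the sketch mirrors the requirement that both stay fixed through time.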

A number of simulation studies have investigated the behavior of this model. See Kadanoff, Coppersmith, and Aldana (2002) for a survey. Flyvbjerg and Kjaer (1988) have studied the degenerate case of $r = 1$ in detail. Derrida and Pomeau (1986) have argued that for $r \ge 3$ there is a phase transition in the behavior of these networks between rapid convergence to a fixed point and exponentially long persistence of changes, and identified the phase transition curve to be given by the equation $r \cdot 2p(1-p) = 1$. The networks with parameters below the curve have behavior that is 'ordered', and those with parameters above the curve have 'chaotic' behavior. Since chaos is not healthy for a biological network, it should not be surprising that real biological networks avoid this phase. See Kauffman (1993), Shmulevich, Kauffman, and Aldana (2005), and Nykter et al. (2008).

To explain the intuition behind the conclusion of Derrida and Pomeau (1986), we define another process $\{\zeta_t(x) : t \ge 1\}$ for $x \in V_n$, which they called the annealed approximation. The idea is that $\zeta_t(x) = 1$ if and only if $\eta_t(x) \ne \eta_{t-1}(x)$, and $\zeta_t(x) = 0$ otherwise. Now if the state of at least one of the inputs $y_1(x), \ldots, y_r(x)$ into node $x$ has changed at time $t$, then the state of node $x$ at time $t + 1$ will be computed by looking at a different value of $f_x$. If we ignore the fact that we may have used this entry before, we get the dynamics of the threshold contact process

$$P(\zeta_{t+1}(x) = 1 \mid \zeta_t(y_1(x)) + \cdots + \zeta_t(y_r(x)) > 0) = 2p(1-p),$$

and $\zeta_{t+1}(x) = 0$ otherwise. Conditional on the state at time $t$, the decisions on the values of $\zeta_{t+1}(x)$, $x \in V_n$, are made independently.

We content ourselves to work with the threshold contact process, since it gives an approximate sense of the original model, and we can prove rigorous results about its behavior. To simplify notation and explore the full range of threshold contact processes we let $q = 2p(1-p)$, and suppose $0 < q < 1$. As mentioned above, it is widely accepted that the condition for prolonged persistence of the threshold contact process is $qr > 1$. To explain this, we note that vertices in the graph $G_n$ have average out-degree $r$, so a value of 1 at a vertex will, on the average, produce $qr$ 1's in the next generation.

We will also write the threshold contact process as a set valued process. Let $\xi_t = \{x : \zeta_t(x) = 1\}$. We will refer to the vertices $x \in \xi_t$ as occupied at time $t$. So if $P_G$ is the distribution of the threshold contact process $\xi = \{\xi_t : t \ge 0\}$ conditioned on the graph $G_n$, then

$$P_G(x \in \xi_{t+1} \mid \{y_1(x), \ldots, y_r(x)\} \cap \xi_t \ne \emptyset) = q, \quad\text{and}$$
$$P_G(x \in \xi_{t+1} \mid \{y_1(x), \ldots, y_r(x)\} \cap \xi_t = \emptyset) = 0.$$
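The two transition probabilities above translate directly into one step of a simulation. The sketch below is a minimal illustration under stated assumptions: `random_inputs` and `threshold_contact_step` are hypothetical names, nodes are labelled from 0, and the occupied set $\xi_t$ is stored as a Python `set`.

```python
import random

def random_inputs(n, r, rng):
    """Hypothetical helper: r distinct inputs for each node, none equal to the node."""
    return [rng.sample([y for y in range(n) if y != x], r) for x in range(n)]

def threshold_contact_step(occupied, inputs, q, rng):
    """Return xi_{t+1} given xi_t = occupied.

    A node becomes occupied with probability q if at least one of its
    inputs is occupied, and is vacant otherwise; decisions for different
    nodes use independent random numbers.
    """
    new = set()
    for x, ys in enumerate(inputs):
        if any(y in occupied for y in ys) and rng.random() < q:
            new.add(x)
    return new

if __name__ == "__main__":
    rng = random.Random(0)
    n, r, q = 1000, 3, 0.6
    inputs = random_inputs(n, r, rng)
    xi = set(range(n))  # start from all sites occupied
    for _ in range(20):
        xi = threshold_contact_step(xi, inputs, q, rng)
    print(len(xi) / n)  # empirical density of occupied sites
```

The empty set is absorbing in this sketch, as it is for the process: once no site is occupied, no site can become occupied again.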

Let $\xi^A = \{\xi_t^A : t \ge 0\}$ denote the threshold contact process starting from $\xi_0^A = A \subset V_n$, and $\xi^1 = \{\xi_t^1 : t \ge 0\}$ denote the special case when $A = V_n$. Let $\rho$ be the survival probability of a branching process with offspring distribution $p_r = q$ and $p_0 = 1 - q$. By branching process theory

$$\rho = 1 - \theta, \quad\text{where } \theta \in (0, 1) \text{ satisfies } \theta = 1 - q + q\theta^r. \qquad (1.1)$$
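Equation (1.1) rarely has a convenient closed form, but the extinction probability $\theta$ can be approximated numerically. The following sketch iterates the map $\theta \mapsto 1 - q + q\theta^r$ starting from 0, which is the standard way to reach the smallest fixed point of a generating function; the function name `survival_probability` and the tolerance parameters are assumptions chosen for the example.

```python
def survival_probability(q, r, tol=1e-15, max_iter=100000):
    """Approximate rho = 1 - theta, where theta is the smallest root in [0, 1]
    of theta = 1 - q + q * theta**r (offspring law p_r = q, p_0 = 1 - q)."""
    theta = 0.0
    for _ in range(max_iter):
        nxt = 1.0 - q + q * theta ** r
        if abs(nxt - theta) < tol:
            theta = nxt
            break
        theta = nxt
    return 1.0 - theta

if __name__ == "__main__":
    # For r = 3 and q = 1/2 the equation factors as
    # (theta - 1)(theta^2 + theta - 1) = 0, so theta = (sqrt(5) - 1) / 2.
    print(survival_probability(0.5, 3))
    # When q * r <= 1 the only root in [0, 1] is theta = 1, so rho = 0.
    print(survival_probability(0.2, 3))
```

In the supercritical case $qr > 1$ the iteration converges to a value of $\theta$ strictly below 1, which is why $\rho > 0$ there.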
Using all the ingredients above we now present our first result. 

Theorem 1. Suppose $q(r-1) > 1$ and let $\delta > 0$. Let $\mathbf{P}$ denote the distribution of the threshold contact process $\xi^1$ starting from all sites occupied, on the random graph $G_n$, which has distribution $\mathbb{P}$. Then there is a positive constant $C(\delta)$ so that as $n \to \infty$

$$\inf_{t \le \exp(C(\delta) n)} \mathbf{P}\left(\frac{|\xi_t^1|}{n} \ge \rho - 2\delta\right) \to 1.$$

To prove this result, we will consider the dual coalescing branching process $\hat\xi = \{\hat\xi_t : t \ge 0\}$. In this process if $x$ is occupied at time $t$, then with probability $q$ all of the sites $y_1(x), \ldots, y_r(x)$ will be occupied at time $t + 1$, and with probability $1 - q$ none of them will be occupied at time $t + 1$. Birth events from different sites are independent. Let $\hat\xi^A = \{\hat\xi_t^A : t \ge 0\}$ be the dual process starting from $\hat\xi_0^A = A \subset V_n$. The two processes can be constructed on the same sample space so that for any choices of $A$ and $B$ for the initial sets of occupied sites, $\xi_t^A$ and $\hat\xi_t^B$ satisfy the following duality relationship, see Griffeath (1978).

$$\{\xi_t^A \cap B \ne \emptyset\} = \{\hat\xi_t^B \cap A \ne \emptyset\}, \quad t = 0, 1, 2, \ldots. \qquad (1.2)$$

Taking $A = \{1, 2, \ldots, n\}$ and $B = \{x\}$ this says

$$\{x \in \xi_t^1\} = \{\hat\xi_t^{\{x\}} \ne \emptyset\}, \qquad (1.3)$$

or, taking probabilities of both the events above, the density of occupied sites in $\xi^1$ at time $t$ is equal to the probability that $\hat\xi^{\{x\}}$ survives until time $t$. Since over small distances our graph looks like a tree in which each vertex has $r$ descendants, the last quantity is $\approx \rho$.
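The duality relationship (1.2) can be checked on small examples by building both processes from one shared array of coin flips. The sketch below is one possible coupling, written under explicit assumptions: `coins[t][x]` records whether node `x` is allowed to be occupied at time `t + 1` in the forward process, the dual reads the same rows in reverse order, and all function names are hypothetical.

```python
import random

def random_inputs(n, r, rng):
    """Hypothetical helper: r distinct inputs per node, none equal to the node."""
    return [rng.sample([y for y in range(n) if y != x], r) for x in range(n)]

def forward(A, inputs, coins):
    """Threshold contact process from A, driven by the rows of `coins`."""
    occupied = set(A)
    for row in coins:
        occupied = {x for x, ys in enumerate(inputs)
                    if row[x] and any(y in occupied for y in ys)}
    return occupied

def dual(B, inputs, coins):
    """Coalescing branching process from B, using the same coins in reverse order.

    An occupied x whose coin succeeds occupies all of y_1(x), ..., y_r(x);
    building a set merges particles that land on the same site.
    """
    occupied = set(B)
    for row in reversed(coins):
        nxt = set()
        for x in occupied:
            if row[x]:
                nxt.update(inputs[x])
        occupied = nxt
    return occupied

def duality_holds(n, r, q, T, A, B, seed):
    """Check {xi_T^A meets B} == {dual_T^B meets A} for one sampled graph and coin array."""
    rng = random.Random(seed)
    inputs = random_inputs(n, r, rng)
    coins = [[rng.random() < q for _ in range(n)] for _ in range(T)]
    lhs = bool(forward(A, inputs, coins) & set(B))
    rhs = bool(dual(B, inputs, coins) & set(A))
    return lhs == rhs

if __name__ == "__main__":
    print(all(duality_holds(12, 3, 0.6, 5, [0, 1, 2], [7], s) for s in range(100)))
```

Both events say that there is a chain of successful coins linking a site of $B$ at time $T$ back to a site of $A$ at time 0, which is why the two sides agree realization by realization in this coupling.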

From (1.2) it should be clear that we can prove Theorem 1 by studying the coalescing branching process. The key to this is an "isoperimetric inequality". Let $\hat G_n$ be the graph obtained from our original graph $G_n = (V_n, E_n)$ by reversing the edges. That is, $\hat G_n = (V_n, \hat E_n)$, where $\hat E_n = \{(x, y) : (y, x) \in E_n\}$. Given a set $U \subset V_n$, let

$$U^* = \{y \in V_n : x \to y \text{ for some } x \in U\}, \qquad (1.4)$$

where $x \to y$ means $(x, y) \in \hat E_n$. Note that $U^*$ can contain vertices of $U$. The idea behind this definition is that if $U$ is occupied at time $t$ in the coalescing branching process, then the vertices in $U^*$ may be occupied at time $t + 1$.

Theorem 2. Let $E(m, k)$ be the event that there is a subset $U \subset V_n$ with size $|U| = m$ so that $|U^*| \le k$. Given $\eta > 0$, there is an $\epsilon_0(\eta) > 0$ so that for $m \le \epsilon_0 n$

$$\mathbb{P}[E(m, (r - 1 - \eta)m)] \le \exp(-\eta m \log(n/m)/2).$$

In words, the isoperimetric constant for small sets is $r - 1$. It is this result that forces us to assume $q(r-1) > 1$ in Theorem 1.
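For very small graphs the quantity controlled by Theorem 2 can be computed by exhaustive search. The sketch below is purely illustrative: `star` and `min_star_size` are hypothetical names, the graph is given as a list `inputs` with `inputs[x]` equal to $(y_1(x), \ldots, y_r(x))$, and the brute-force search is only feasible when $n$ and $m$ are tiny.

```python
from itertools import combinations

def star(U, inputs):
    """U* from (1.4): all y with x -> y in the reversed graph for some x in U,
    i.e. all inputs of vertices in U."""
    return {y for x in U for y in inputs[x]}

def min_star_size(inputs, m):
    """Smallest |U*| over all subsets U with |U| = m (brute force)."""
    n = len(inputs)
    return min(len(star(U, inputs)) for U in combinations(range(n), m))

if __name__ == "__main__":
    # A fixed example with n = 4 and r = 3: every vertex has all others as inputs.
    inputs = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
    for m in (1, 2):
        print(m, min_star_size(inputs, m))
```

Comparing `min_star_size(inputs, m)` with $(r - 1 - \eta)m$ on sampled graphs gives a rough empirical feel for the event $E(m, k)$, although the theorem itself concerns the regime of large $n$.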

Claim. There is a $c > 0$ so that if $n$ is large, then, with high probability, for each $m \le cn$ there is a set $U_m$ with $|U_m| = m$ and $|U_m^*| \le 1 + (r-1)m$.

Sketch of Proof. Define an undirected graph $H_n$ on the vertex set $V_n$ so that $x$ and $y$ are adjacent in $H_n$ if and only if there is a $z$ so that $x \to z$ and $y \to z$ in $\hat G_n$. The drawing illustrates the case $r = 3$.

The mean number of neighbors of a vertex in $H_n$ is $r^2 \ge 9$, so standard arguments show that there is a $c > 0$ so that, with probability tending to 1 as $n \to \infty$, there is a connected component $K_n$ of $H_n$ with $|K_n| \ge cn$. If $U$ is a connected subset of $K_n$ with $|U| = \lfloor cn \rfloor$, then by building up $U$ one vertex at a time and keeping it connected we get a sequence of sets $\{U_m, m = 1, 2, \ldots, \lfloor cn \rfloor\}$ with $|U_m| = m$ and $|U_m^*| \le 1 + (r-1)m$. □

Since the isoperimetric constant is $\le r - 1$, it follows that when $q(r-1) < 1$, then for any $\epsilon > 0$ there are bad sets $A$ with $|A| \le n\epsilon$, so that $E|\hat\xi_1^A| \le |A|$. Computations from the proof of Theorem 2 suggest that there are a large number of bad sets. We have no idea how to bound the amount of time spent in bad sets, so we have to take a different approach to show persistence when $1/r < q \le 1/(r-1)$.



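Theorem 3. Suppose $qr > 1$. If $\delta_0$ is small enough, then for any $0 < \delta < \delta_0$, there are constants $C(\delta) > 0$ and $B(\delta) = (1/8 - 2\delta)\log(qr - \delta)/\log r$ so that as $n \to \infty$

$$\inf_{t \le \exp(C(\delta) \cdot n^{B(\delta)})} \mathbf{P}\left(\frac{|\xi_t^1|}{n} \ge \rho - 2\delta\right) \to 1.$$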

To prove this, we will again investigate persistence of the dual. Let

$$d_0(x, y) = \text{length of a shortest oriented path from } x \text{ to } y \text{ in } \hat G_n,$$
$$d(x, y) = \min_{z \in V_n} [d_0(x, z) + d_0(y, z)], \qquad (1.5)$$

and for any subset $A$ of vertices let

$$m(A, K) = \max\{|S| : S \subset A,\ d(x, y) \ge K \text{ for } x, y \in S,\ x \ne y\}. \qquad (1.6)$$
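The distances in (1.5) can be computed by breadth-first search in the reversed graph. The sketch below assumes the graph is stored as `inputs[x] = (y_1(x), ..., y_r(x))`, so that the oriented edges of the reversed graph are $x \to y_i(x)$; all function names are hypothetical. Computing $m(A, K)$ exactly requires a search over subsets, so the last function only returns a greedily built $K$-separated subset, whose size is a lower bound on $m(A, K)$ and not necessarily its value.

```python
from collections import deque

def d0_from(x, inputs):
    """Oriented BFS distances d_0(x, .) in the reversed graph (edges x -> y_i(x))."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        for y in inputs[u]:
            if y not in dist:
                dist[y] = dist[u] + 1
                queue.append(y)
    return dist

def d(x, y, inputs):
    """d(x, y) = min over z of d_0(x, z) + d_0(y, z); infinite if no common z is reachable."""
    dx, dy = d0_from(x, inputs), d0_from(y, inputs)
    common = dx.keys() & dy.keys()
    return min((dx[z] + dy[z] for z in common), default=float("inf"))

def greedy_separated(A, K, inputs):
    """Greedy subset S of A with d(x, y) >= K for distinct x, y in S.

    len(S) is only a lower bound on m(A, K) from (1.6).
    """
    chosen = []
    for x in sorted(A):
        if all(d(x, s, inputs) >= K for s in chosen):
            chosen.append(x)
    return chosen

if __name__ == "__main__":
    inputs = [[1], [2], [0]]  # a directed 3-cycle with r = 1, for illustration only
    print(d0_from(0, inputs), d(0, 1, inputs), greedy_separated([0, 1, 2], 2, inputs))
```

Note that $d$ is symmetric even though $d_0$ is not, since it measures how soon the oriented explorations from $x$ and $y$ can meet.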

Let $R = \log n/\log r$ be the average value of $d_0(1, y)$, let $a = 1/8 - \delta$ and $B = (a - \delta)\log(qr - \delta)/\log r$. We will show that if $m(\hat\xi_s^A, 2\lceil aR \rceil) < \lfloor n^B \rfloor$ at some time $s$, then with high probability, we will later have $m(\hat\xi_t^A, 2\lceil aR \rceil) \ge \lfloor n^B \rfloor$ for some $t > s$. To do this we explore the vertices in $\hat G_n$ one at a time using a breadth-first search algorithm based on the distance function $d_0$. We say that a collision has occurred if we encounter a vertex more than once in the exploration process. First we show in Lemma 3.1 that, with probability tending to 1 as $n \to \infty$, there can be at most one collision in the set $\{u : d_0(x, u) \le 2\lceil aR \rceil\}$ for any $x \in V_n$. Then we argue in Lemma 3.2 that when we first have $m(\hat\xi_t^A, 2\lceil aR \rceil) < \lfloor n^B \rfloor$, there is a subset $N$ of occupied sites so that $|N| \ge (q - \delta)\lfloor n^B \rfloor$, and $d(z, w) \ge 2\lceil aR \rceil - 2$ for any two distinct vertices $z, w \in N$, and $\{u : d_0(z, u) \le 2\lceil aR \rceil - 1\}$ has no collision. We run the dual process starting from the vertices of $N$ until time $\lceil aR \rceil - 1$, so they are independent. With high probability there will be at least one vertex $w \in N$ for which

$$|\hat\xi^{\{w\}}_{\lceil aR \rceil - 1}| \ge \lceil n^B \rceil.$$

By the choice of $N$, for any two distinct vertices $x, z \in \hat\xi^{\{w\}}_{\lceil aR \rceil - 1}$, $d(x, z) \ge 2\lceil aR \rceil$. It seems foolish to pick only one vertex $w$, but we do not know how to guarantee that the vertices are suitably separated if we pick more.
2 Proof of Theorem 1

We begin with the proof of the isoperimetric inequality, Theorem 2.

Proof of Theorem 2. Let $p(m, k)$ be the probability that there is a set $U$ with $|U| = m$ and $|U^*| = k$. First we will estimate $p(m, \ell)$ where $\ell = \lfloor (r - 1 - \eta)m \rfloor$.

$$p(m, \ell) \le \sum_{\{(U, U') : |U| = m, |U'| = \ell\}} \mathbb{P}(U^* = U') \le \sum_{\{(U, U') : |U| = m, |U'| = \ell\}} \mathbb{P}(U^* \subset U').$$

According to the construction of $G_n$, for any $x \in U$ the other ends of the $r$ edges coming out of it are distinct and they are chosen at random from $V_n \setminus \{x\}$. So

$$\mathbb{P}(U^* \subset U') \le \left(\frac{|U'|}{n - 1}\right)^{rm},$$

and hence

$$p(m, \ell) \le \binom{n}{m}\binom{n}{\ell}\left(\frac{\ell}{n - 1}\right)^{rm}. \qquad (2.1)$$

To bound the right-hand side, we use the trivial bound

$$\binom{n}{m} \le \frac{n^m}{m!} \le \left(\frac{ne}{m}\right)^m, \qquad (2.2)$$

where the second inequality follows from $e^m > m^m/m!$. Using (2.2) in (2.1)



$$p(m, \ell) \le (ne/m)^m (ne/\ell)^{\ell} \left(\frac{\ell}{n}\right)^{rm} \left(\frac{n}{n - 1}\right)^{rm}.$$

Recalling $\ell \le (r - 1 - \eta)m$, and accumulating the terms involving $(m/n)$, $r - 1 - \eta$ and $e$, the last expression becomes

$$\le e^{m(r - \eta)} (m/n)^{m[-1 - (r - 1 - \eta) + r]} (r - 1 - \eta)^{-(r - 1 - \eta)m + rm} [n/(n-1)]^{rm}$$
$$= e^{m(r - \eta)} (m/n)^{m\eta} (r - 1 - \eta)^{m(1 + \eta)} [n/(n-1)]^{rm}.$$

Letting $c(\eta) = r - \eta + r\log(n/(n-1)) + (1 + \eta)\log(r - 1 - \eta) \le C$ for $\eta \in (0, r - 1)$, we have

$$p(m, \lfloor (r - 1 - \eta)m \rfloor) \le \exp(-\eta m \log(n/m) + Cm).$$

Summing over integers $k = (r - 1 - \eta')m$ with $\eta' > \eta$, and noting that there are fewer than $rm$ terms in the sum, we have

$$\mathbb{P}[E(m, (r - 1 - \eta)m)] \le \exp(-\eta m \log(n/m) + C'm).$$

To clean up the result to the one given in Theorem 2, choose $\epsilon_0$ such that $\eta \log(1/\epsilon_0)/2 > C'$. Hence for any $m \le \epsilon_0 n$,

$$\eta \log(n/m)/2 \ge \eta \log(1/\epsilon_0)/2 > C',$$

which gives the desired result. □
Our next goal is to show that the graph $\hat G_n$ locally looks like a tree with high probability. For that we explore all the vertices in $V_n$ one at a time, starting from a vertex $x$, and using a breadth-first search algorithm based on the distance function $d_0$ of (1.5). More precisely, for each $x \in V_n$, we define the sets $A_x^k$, which we call the active set at the $k$th step, and $R_x^k$, which we call the removed set at the $k$th step, for $k = 0, 1, \ldots, \beta_x$, where $\beta_x = \min\{l : A_x^l = \emptyset\}$, sequentially as follows. $R_x^0 = \emptyset$ and $A_x^0 = \{x\}$. Let $D(x, l) = \{y : d_0(x, y) \le l\}$. For $0 \le k < \beta_x$, we set $l_k = \min\{l : 0 \le l \le k, A_x^k \cap D(x, l) \ne \emptyset\}$, and choose $x_k \in A_x^k \cap D(x, l_k)$ with the minimum index.

If $x_k \in R_x^k$, then $A_x^{k+1} = A_x^k \setminus \{x_k\}$, $R_x^{k+1} = R_x^k$, and

if $x_k \notin R_x^k$, then $A_x^{k+1} = A_x^k \cup \{y_1(x_k), \ldots, y_r(x_k)\} \setminus \{x_k\}$, $R_x^{k+1} = R_x^k \cup \{x_k\}$.

If $x_k \in R_x^k$, we say that a collision has occurred while exploring $\hat G_n$ starting from $x$. The choice of $x_k$ ensures that while exploring the graph starting from $x$, for any $j \ge 1$, we consider the vertices which are at $d_0$ distance $j$ from $x$ prior to those which are at $d_0$ distance $j + 1$ from $x$.
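The exploration can be sketched as a level-by-level breadth-first search that counts collisions. The code below is a simplified illustration, not the exact bookkeeping of the text: it keeps the active vertices of each level in a list (so a vertex reached twice is met twice), processes each level in increasing index order, and stops after `max_steps` steps. The names `explore` and `max_steps` are assumptions made for the example.

```python
def explore(x, inputs, max_steps):
    """Explore the reversed graph from x for at most max_steps steps.

    Returns (removed, collisions): the set of vertices removed so far and
    the number of steps at which the chosen vertex had already been removed.
    """
    removed = set()
    collisions = 0
    steps = 0
    level = [x]
    while level and steps < max_steps:
        next_level = []
        for v in sorted(level):          # minimum index first within a level
            if steps == max_steps:
                break
            steps += 1
            if v in removed:
                collisions += 1          # vertex met again: a collision
            else:
                removed.add(v)
                next_level.extend(inputs[v])  # activate y_1(v), ..., y_r(v)
        level = next_level
    return removed, collisions

if __name__ == "__main__":
    # Hypothetical graph with n = 7, r = 2 that looks like a binary tree near vertex 0.
    inputs = [[1, 2], [3, 4], [5, 6], [0, 1], [0, 2], [1, 2], [3, 4]]
    print(explore(0, inputs, 7))
```

When no collision occurs during the first $k$ steps, the removed set has exactly $k$ vertices, which is the situation the next lemma shows to be typical for small $k$.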

The next lemma shows that with high probability $R_x^k$ will have $k$ vertices, and for $x \ne z$, $R_x^k$ and $R_z^k$ do not intersect each other, when $k \le n^{1/2 - \delta}$. For the lemma we need the following stopping times.

$$\pi_x^1 = \min\{l \ge 1 : |R_x^l| < l\},$$
$$\pi_{x,z} = \min\{l \ge 1 : R_x^l \cap R_z^l \ne \emptyset\}, \quad x \ne z,$$
$$\alpha_x^{n,\delta} = \min\{l \ge 1 : |R_x^l| \ge \lceil n^{1/2 - \delta} \rceil\}, \quad \delta < 1/2, \qquad (2.3)$$
$$\beta_x = \min\{l \ge 1 : A_x^l = \emptyset\}.$$

So $\pi_x^1$ is the time of first collision while exploring $\hat G_n$ starting from $x$, and $\pi_{x,z}$ is the time of first collision while exploring $\hat G_n$ simultaneously from $x$ and $z$.

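Lemma 2.1. Suppose $0 < \delta < 1/2$. Let $I_x^\delta$, $x \in V_n$, and $I_{x,z}^\delta$, $x, z \in V_n$, $x \ne z$, be the events

$$I_x^\delta = \{\pi_x^1 \wedge \beta_x \ge \alpha_x^{n,\delta}\}, \qquad I_{x,z}^\delta = I_x^\delta \cap I_z^\delta \cap \{\pi_{x,z} \ge \alpha_x^{n,\delta} \vee \alpha_z^{n,\delta}\},$$

where $\pi_x^1$, $\pi_{x,z}$, $\alpha_x^{n,\delta}$ and $\beta_x$ are the stopping times defined in (2.3). Then

$$\mathbb{P}[(I_x^\delta)^c] \le n^{-2\delta}, \qquad \mathbb{P}[(I_{x,z}^\delta)^c] \le 5n^{-2\delta} \qquad (2.4)$$

for large enough $n$.

Note that the randomness, which determines whether the events $I_x^\delta$ and $I_{x,z}^\delta$ occur or not, arises only from the construction of the random graph $G_n$, and does not involve the threshold contact process $\xi$ on $G_n$.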
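Proof. Let $\delta' = 1/2 - \delta$. Since in the construction of the random graph $G_n$ the input nodes $y_i(z)$, $1 \le i \le r$, for any vertex $z$ are distinct and different from $z$, there are at least $n - r$ choices for each $y_i(z)$. Also $|R_x^l| \le l$ for any $l$. So

$$\mathbb{P}(|R_x^k| = |R_x^{k-1}|) \le (k - 1)/(n - r). \qquad (2.5)$$

It is easy to check that $\pi_x^1 \wedge \beta_x \ge \alpha_x^{n,\delta}$ if $|R_x^k| \ne |R_x^{k-1}|$ for $k = 1, 2, \ldots, \lceil n^{\delta'} \rceil$. So

$$\mathbb{P}[(I_x^\delta)^c] \le \mathbb{P}\Big[\bigcup_{k=1}^{\lceil n^{\delta'} \rceil} (|R_x^k| = |R_x^{k-1}|)\Big] \le \sum_{k=1}^{\lceil n^{\delta'} \rceil} \mathbb{P}(|R_x^k| = |R_x^{k-1}|) \le \sum_{k=1}^{\lceil n^{\delta'} \rceil} (k - 1)/(n - r) \le n^{2\delta'}/n = n^{-2\delta}$$

for large enough $n$. For the other assertion, note that $I_{x,z}^\delta$ occurs if $|R_x^k| \ne |R_x^{k-1}|$, $|R_z^k| \ne |R_z^{k-1}|$ and $R_x^k \cap R_z^k = \emptyset$ for $k = 1, 2, \ldots, \lceil n^{\delta'} \rceil$. Also if for some $k \ge 1$, $R_x^k \cap R_z^k \ne \emptyset$ and $R_x^l \cap R_z^l = \emptyset$ for all $1 \le l < k$, then either $R_x^k = R_x^{k-1} \cup \{x_{k-1}\}$ and $x_{k-1} \in R_z^{k-1}$, or $R_z^k = R_z^{k-1} \cup \{z_{k-1}\}$ and $z_{k-1} \in R_x^k$. Now since each of the input nodes in the construction of $G_n$ has at least $n - r$ choices, and $|R_x^l|, |R_z^l| \le l$ for any $l$,

$$\mathbb{P}(R_x^k \cap R_z^k \ne \emptyset,\ R_x^l \cap R_z^l = \emptyset,\ 1 \le l < k) \le \mathbb{P}(x_{k-1} \in R_z^{k-1}) + \mathbb{P}(z_{k-1} \in R_x^k) \le (2k - 1)/(n - r). \qquad (2.6)$$

Combining the error probabilities of (2.5) and (2.6)

$$\mathbb{P}[(I_{x,z}^\delta)^c] \le \mathbb{P}\Big[\bigcup_{k=1}^{\lceil n^{\delta'} \rceil} (|R_x^k| = |R_x^{k-1}|) \cup \bigcup_{k=1}^{\lceil n^{\delta'} \rceil} (|R_z^k| = |R_z^{k-1}|) \cup \bigcup_{k=1}^{\lceil n^{\delta'} \rceil} (R_x^k \cap R_z^k \ne \emptyset)\Big]$$
$$\le \sum_{k=1}^{\lceil n^{\delta'} \rceil} \big[\mathbb{P}(|R_x^k| = |R_x^{k-1}|) + \mathbb{P}(|R_z^k| = |R_z^{k-1}|) + \mathbb{P}(R_x^k \cap R_z^k \ne \emptyset,\ R_x^l \cap R_z^l = \emptyset,\ 1 \le l < k)\big]$$
$$\le \sum_{k=1}^{\lceil n^{\delta'} \rceil} (4k - 3)/(n - r) \le 5n^{2\delta' - 1} = 5n^{-2\delta}$$

for large $n$. □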

Lemma 2.1 shows that $\hat G_n$ is locally tree-like. The number of vertices in the induced subgraph $\hat G_{x,M}$ with vertex set $V_n \cap \{u : d_0(x, u) \le M\}$ is at most $1 + r + \cdots + r^M \le 2r^M$. So if $I_x^\delta$ occurs, then, for any $M$ satisfying $2r^M \le n^{1/2 - \delta}$, the subgraph $\hat G_{x,M}$ is an oriented finite $r$-tree, where each vertex except the leaves has out-degree $r$. Similarly if $I_{x,z}^\delta$ occurs, then for any such $M$, $\hat G_{x,M} \cap \hat G_{z,M} = \emptyset$.

In the next lemma, we will use this to get a bound on the survival of the dual process for small times. Let $\rho$ be the branching process survival probability defined in (1.1).
Lemma 2.2. If $q > 1/r$, $\delta \in (0, qr - 1)$, $\gamma = (20 \log r)^{-1}$, and $b = \gamma \log(qr - \delta)$ then for any $x \in V_n$, if $n$ is large,

$$\mathbf{P}\left(|\hat\xi^{\{x\}}_{\lceil 2\gamma \log n \rceil}| \ge \lceil n^b \rceil\right) \ge \rho - \delta.$$

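Proof. Let $I_x$ be the event

$$I_x = \{\pi_x^1 \wedge \beta_x \ge \alpha_x^{n,1/4}\},$$

where $\pi_x^1$, $\alpha_x^{n,1/4}$ are as in (2.3). Let $P_{Z^x}$ be the distribution of a branching process $Z^x = \{Z_t^x : t = 0, 1, 2, \ldots\}$ with $Z_0^x = 1$ and offspring distribution $p_0 = 1 - q$ and $p_r = q$. Since $q > 1/r$, this is a supercritical branching process. Let $B_x$ be the event that the branching process survives. Then

$$P_{Z^x}(B_x) = \rho,$$

where $\rho$ is as in (1.1). If we condition on $B_x$, then, using a large deviation result for branching processes from Athreya (1994),

$$P_{Z^x}\left(\left|\frac{Z_{t+1}^x}{Z_t^x} - qr\right| > \delta \,\Big|\, B_x\right) \le e^{-c(\delta)t} \qquad (2.7)$$

for some constant $c(\delta) > 0$ and for large enough $t$. So if $F_x = \{Z_{t+1}^x \ge (qr - \delta)Z_t^x \text{ for } \lfloor \gamma \log n \rfloor \le t < \lceil 2\gamma \log n \rceil\}$, then

$$P_{Z^x}(F_x^c \mid B_x) \le \sum_{t = \lfloor \gamma \log n \rfloor}^{\lceil 2\gamma \log n \rceil - 1} e^{-c(\delta)t} \le C_\delta n^{-c(\delta)\gamma/2}$$

for some constant $C_\delta > 0$ and for large enough $n$. On the event $B_x \cap F_x$,

$$Z^x_{\lceil 2\gamma \log n \rceil} \ge (qr - \delta)^{\lceil 2\gamma \log n \rceil - \lfloor \gamma \log n \rfloor} \ge (qr - \delta)^{\gamma \log n} = n^{\gamma \log(qr - \delta)}, \qquad (2.8)$$

since $Z^x_{\lfloor \gamma \log n \rfloor} \ge 1$ on $B_x$.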



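Now coming back to the dual process $\hat\xi^{\{x\}}$, let $P_{I_x}$ denote the conditional distribution of $\hat\xi^{\{x\}}$ given $I_x$. This does not specify the entire graph but we will only use the conditional law for events that involve the process on the subtree whose existence is guaranteed by $I_x$. By the choice of $\gamma$, the number of vertices in the subgraph induced by $\{u : d_0(x, u) \le \lceil 2\gamma \log n \rceil\}$ is at most $2r^{\lceil 2\gamma \log n \rceil} \le n^{1/4}$. Then it is easy to see that we can couple $P_{I_x}$ with $P_{Z^x}$ so that

$$P_{I_x}\left[\left(|\hat\xi_t^{\{x\}}|, 0 \le t \le \lceil 2\gamma \log n \rceil\right) \in \cdot\right] = P_{Z^x}\left[\left(Z_t^x, 0 \le t \le \lceil 2\gamma \log n \rceil\right) \in \cdot\right].$$

Combining the error probabilities of (2.4) and (2.8)

$$\mathbf{P}\left(|\hat\xi^{\{x\}}_{\lceil 2\gamma \log n \rceil}| \ge \lceil n^b \rceil\right) \ge \mathbf{P}\left(\left\{|\hat\xi^{\{x\}}_{\lceil 2\gamma \log n \rceil}| \ge \lceil n^b \rceil\right\} \cap I_x\right)$$
$$= P_{Z^x}\left(Z^x_{\lceil 2\gamma \log n \rceil} \ge \lceil n^b \rceil\right)\mathbb{P}(I_x)$$
$$\ge P_{Z^x}(B_x \cap F_x)\,\mathbb{P}(I_x)$$
$$= P_{Z^x}(B_x)\,P_{Z^x}(F_x \mid B_x)\,\mathbb{P}(I_x)$$
$$\ge \rho\left(1 - C_\delta n^{-c(\delta)\gamma/2}\right)\left(1 - n^{-1/2}\right) \ge \rho - \delta$$

for large enough $n$. □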



Lemma 2.2 shows that the dual process starting from one vertex will with probability $\ge \rho - \delta$ survive until there are $\lceil n^b \rceil$ many occupied sites. The next lemma will show that if the dual starts with $\lceil n^b \rceil$ many occupied sites, then for some $\epsilon > 0$ it will have $\lceil \epsilon n \rceil$ many occupied sites with high probability.

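Lemma 2.3. If $q(r-1) > 1$, then there exists $\epsilon_1 > 0$ such that for any $A$ with $|A| \ge \lceil n^b \rceil$ the dual process $\hat\xi^A$ satisfies

$$\mathbf{P}\left(\max_{t \le \lceil \epsilon_1 n - n^b \rceil} |\hat\xi_t^A| < \epsilon_1 n\right) \le \exp(-n^{b/4}).$$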

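Proof. Choose $\eta > 0$ such that $(q - \eta)(r - 1 - \eta) > 1$, and let $\epsilon_0(\eta)$ be the constant in Theorem 2. Take $\epsilon_1 = \epsilon_0(\eta)$. Let $\nu = \min\{t : |\hat\xi_t^A| \ge \lceil \epsilon_1 n \rceil\}$. Let $F_t = \{|\hat\xi_t^A| > |\hat\xi_{t-1}^A|\}$, and

$$B_t = \{\text{at least } (q - \eta)|\hat\xi_t^A| \text{ occupied sites of } \hat\xi_t^A \text{ give birth}\},$$
$$C_t = \{|U_t^*| \ge (r - 1 - \eta)|U_t|\}, \quad\text{where } U_t = \{x \in \hat\xi_t^A : x \text{ gives birth}\}.$$

Now if $B_t$ and $C_t$ occur, then

$$|\hat\xi_{t+1}^A| = |U_t^*| \ge (r - 1 - \eta)|U_t| \ge (r - 1 - \eta)(q - \eta)|\hat\xi_t^A| > |\hat\xi_t^A|, \qquad (2.9)$$

i.e. $F_{t+1}$ occurs. So $F_{t+1} \supset B_t \cap C_t$ for all $t \ge 0$. Using the binomial large deviations, see Lemma 2.3.3 on page 40 in Durrett (2007),

$$P_G(B_t \mid \hat\xi_t^A) \ge 1 - \exp\left(-\Gamma((q - \eta)/q)\,q\,|\hat\xi_t^A|\right), \qquad (2.10)$$

where $\Gamma(x) = x\log x - x + 1 > 0$ for $x \ne 1$. If we take $H_0 = \{|\hat\xi_0^A| \ge \lceil n^b \rceil\}$ and $H_t = H_0 \cap \bigcap_{s=1}^t F_s$, then $|\hat\xi_t^A| \ge \lceil n^b \rceil$ on the event $H_t$ for all $t \ge 0$. Keeping that in mind we can replace $|\hat\xi_t^A|$ in the right side of (2.10) by $n^b$ to have

$$P_G(B_t^c \cap H_t) \le P_G\left(B_t^c \cap \{|\hat\xi_t^A| \ge \lceil n^b \rceil\}\right) \le \exp\left(-\Gamma((q - \eta)/q)\,q\,n^b\right) \quad \forall t \ge 0. \qquad (2.11)$$

The same bound also works for the unconditional probability distribution $\mathbf{P}$. Next we see that $P_G(C_t \mid U_t) \ge 1 - 1_E$, where $E = E(|U_t|, (r - 1 - \eta)|U_t|)$, as defined in Theorem 2. Taking expectation with respect to the distribution of $G_n$, $\mathbf{P}(C_t \mid U_t) \ge \mathbb{P}(E^c)$. Since for $t < \nu$, $|U_t| \le \epsilon_0(\eta)n$, and $|U_t| \ge (q - \eta)n^b \ge n^b/(r - 1)$ on $H_t \cap B_t$, using Theorem 2

$$\mathbf{P}(C_t^c \cap B_t \cap H_t \cap \{t < \nu\}) \le \mathbf{P}\left[C_t^c \cap \{n^b/(r - 1) \le |U_t| \le \epsilon_1 n\}\right] \le \exp\left(-\frac{\eta}{2}\,\frac{n^b}{r - 1}\,\log\frac{n(r - 1)}{n^b}\right). \qquad (2.12)$$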
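Combining these two bounds of (2.11) and (2.12) we get

$$\mathbf{P}(F_{t+1}^c \cap H_t \cap \{t < \nu\}) \le \mathbf{P}((B_t \cap C_t)^c \cap H_t \cap \{t < \nu\})$$
$$\le \mathbf{P}(B_t^c \cap H_t) + \mathbf{P}(C_t^c \cap B_t \cap H_t \cap \{t < \nu\}) \le \exp(-n^{b/2})$$

for large $n$. Since $\nu \le \lceil \epsilon_1 n - n^b \rceil$ on $H_{\lceil \epsilon_1 n - n^b \rceil}$,

$$\mathbf{P}(\nu > \lceil \epsilon_1 n - n^b \rceil) \le \mathbf{P}\Big[(\nu > \lceil \epsilon_1 n - n^b \rceil) \cap \Big(\bigcup_{t=1}^{\lceil \epsilon_1 n - n^b \rceil} F_t^c\Big)\Big]$$
$$\le \sum_{t=1}^{\lceil \epsilon_1 n - n^b \rceil} \mathbf{P}(F_t^c \cap H_{t-1} \cap \{\nu > t - 1\})$$
$$\le \lceil \epsilon_1 n - n^b \rceil \exp(-n^{b/2}) \le \exp(-n^{b/4})$$

for large $n$ and we get the result. □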



The next result shows that if there are $\lceil \epsilon n \rceil$ many occupied sites at some time for some $\epsilon > 0$, then the dual process survives for at least $\exp(cn)$ units of time for some constant $c$.



Lemma 2.4. If $q(r-1) > 1$, then there exist constants $c > 0$ and $\epsilon_1 > 0$ as in Lemma 2.3 such that for $T = \exp(cn)$ and any $A$ with $|A| \ge \lceil \epsilon_1 n \rceil$,

$$\mathbf{P}\left(\inf_{t \le T} |\hat\xi_t^A| < \epsilon_1 n\right) \le 2\exp(-cn).$$



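Proof. Choose $\eta > 0$ so that $(q - \eta)(r - 1 - \eta) > 1$, and then choose $\epsilon_0(\eta) > 0$ as in Theorem 2. Take $\epsilon_1 = \epsilon_0(\eta)$. For any $A$ with $|A| \ge \lceil \epsilon_1 n \rceil$, let $U_t' = \{x \in \hat\xi_t^A : x \text{ gives birth}\}$, $t = 0, 1, \ldots$. If $|U_t'| \le \lfloor \epsilon_1 n \rfloor$, then take $U_t = U_t'$. If $|U_t'| > \epsilon_1 n$, we have too many vertices to use Theorem 2, so we let $U_t$ be the subset of $U_t'$ consisting of the $\lfloor \epsilon_1 n \rfloor$ vertices with smallest indices. Let

$$F_t = \{|\hat\xi_t^A| \ge \lceil \epsilon_1 n \rceil\},$$
$$B_t = \{\text{at least } (q - \eta)|\hat\xi_t^A| \text{ many occupied sites of } \hat\xi_t^A \text{ give birth}\},$$
$$C_t = \{|U_t^*| \ge (r - 1 - \eta)|U_t|\}.$$

Now using an argument similar to the one for (2.9), with $H_t = \bigcap_{s=0}^t F_s$ we have $F_{t+1} \cap H_t \supset B_t \cap C_t \cap H_t$ for any $t \ge 0$. Using our binomial large deviations result (2.10) again, $P_G(B_t \mid \hat\xi_t^A) \ge 1 - \exp(-\Gamma((q - \eta)/q)\,q\,|\hat\xi_t^A|)$. On the event $F_t$, $|\hat\xi_t^A| \ge \lceil \epsilon_1 n \rceil$, and so

$$P_G(B_t^c \cap H_t) \le P_G\left[B_t^c \cap \{|\hat\xi_t^A| \ge \lceil \epsilon_1 n \rceil\}\right] \le \exp\left(-\Gamma((q - \eta)/q)\,q\,\epsilon_1 n\right).$$

The same bound works for the unconditional probability distribution $\mathbf{P}$.

Since $|U_t| \le \epsilon_1 n$, and on the event $H_t \cap B_t$, $|U_t| \ge (q - \eta)\epsilon_1 n \ge \epsilon_1 n/(r - 1)$, using Theorem 2 and an argument similar to the one that leads to (2.12) we have

$$\mathbf{P}(C_t^c \cap H_t \cap B_t) \le \exp\left(-\frac{\eta}{2}\,\frac{\epsilon_1 n}{r - 1}\,\log\frac{r - 1}{\epsilon_1}\right).$$

Combining these two bounds

$$\mathbf{P}(F_{t+1}^c \cap H_t) \le \mathbf{P}[(B_t \cap C_t)^c \cap H_t] \le \mathbf{P}(B_t^c \cap H_t) + \mathbf{P}(C_t^c \cap B_t \cap H_t) \le 2\exp(-2c(\eta)n),$$

where

$$c(\eta) = \frac{1}{2}\min\left\{\Gamma\left(\frac{q - \eta}{q}\right) q\,\epsilon_1,\ \frac{\eta}{2}\,\frac{\epsilon_1}{r - 1}\,\log\frac{r - 1}{\epsilon_1}\right\}.$$

Hence for $T = \exp(c(\eta)n)$

$$\mathbf{P}\left(\inf_{t \le T} |\hat\xi_t^A| < \epsilon_1 n\right) \le \mathbf{P}\Big(\bigcup_{t=0}^{\lfloor T \rfloor - 1} (F_{t+1}^c \cap H_t)\Big) \le \sum_{t=0}^{\lfloor T \rfloor - 1} \mathbf{P}(F_{t+1}^c \cap H_t) \le 2T\exp(-2c(\eta)n) = 2\exp(-c(\eta)n),$$

which completes the proof. □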

Lemma 2.4 confirms prolonged persistence for the dual. We will now give the

Proof of Theorem 1. Choose $\delta \in (0, qr - 1)$ and $\gamma = (20\log r)^{-1}$. Define the random variables $Y_x$, $1 \le x \le n$, so that $Y_x = 1$ if the dual process $\hat\xi^{\{x\}}$ starting at $x$ satisfies

$$|\hat\xi^{\{x\}}_{\lceil 2\gamma \log n \rceil}| \ge \lceil n^b \rceil$$

for $b = \gamma\log(qr - \delta)$, and $Y_x = 0$ otherwise. By Lemma 2.2, if $n$ is large, then $EY_x \ge \rho - \delta$ for any $x$.



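Let $\pi_x^1$, $\pi_{x,z}$ and $\alpha_x^{n,3/10}$ be the stopping times as in (2.3), and let $I_{x,z} = I_{x,z}^{3/10}$ be the corresponding event as in Lemma 2.1. Recall that $\hat G_{x,M}$ is the subgraph with vertex set $V_n \cap \{u : d_0(x, u) \le M\}$. On the event $I_{x,z}$, $\hat G_{x,\lceil 2\gamma \log n \rceil}$ and $\hat G_{z,\lceil 2\gamma \log n \rceil}$ are oriented finite $r$-trees consisting of disjoint sets of vertices, since $2r^{\lceil 2\gamma \log n \rceil} \le n^{1/5}$ by the choice of $\gamma$. Hence if $P_{I_{x,z}}$ is the conditional distribution of $(\hat\xi^{\{x\}}, \hat\xi^{\{z\}})$ given $I_{x,z}$, then

$$P_{I_{x,z}}\left[(\hat\xi_t^{\{x\}}, 0 \le t \le \lceil 2\gamma \log n \rceil) \in \cdot,\ (\hat\xi_t^{\{z\}}, 0 \le t \le \lceil 2\gamma \log n \rceil) \in \cdot\right]$$
$$= P_{I_{x,z}}\left[(\hat\xi_t^{\{x\}}, 0 \le t \le \lceil 2\gamma \log n \rceil) \in \cdot\right] P_{I_{x,z}}\left[(\hat\xi_t^{\{z\}}, 0 \le t \le \lceil 2\gamma \log n \rceil) \in \cdot\right].$$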
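Having all the ingredients ready we will now estimate the covariance between the events $\{Y_x = 1\}$ and $\{Y_z = 1\}$ for $x \ne z$. Standard probability arguments give the inequalities

$$\mathbf{P}(Y_x = 1, Y_z = 1) \le \mathbf{P}[(Y_x = 1, Y_z = 1) \cap I_{x,z}] + \mathbb{P}(I_{x,z}^c)$$
$$= P_{I_{x,z}}(Y_x = 1, Y_z = 1)\,\mathbb{P}(I_{x,z}) + \mathbb{P}(I_{x,z}^c)$$
$$= P_{I_{x,z}}(Y_x = 1)\,P_{I_{x,z}}(Y_z = 1)\,\mathbb{P}(I_{x,z}) + \mathbb{P}(I_{x,z}^c)$$
$$= \mathbf{P}[(Y_x = 1) \cap I_{x,z}]\,\mathbf{P}[(Y_z = 1) \cap I_{x,z}]/\mathbb{P}(I_{x,z}) + \mathbb{P}(I_{x,z}^c)$$
$$\le \mathbf{P}(Y_x = 1)\,\mathbf{P}(Y_z = 1)/\mathbb{P}(I_{x,z}) + \mathbb{P}(I_{x,z}^c).$$

Subtracting $\mathbf{P}(Y_x = 1)\mathbf{P}(Y_z = 1)$ from both sides gives

$$\mathbf{P}(Y_x = 1, Y_z = 1) - \mathbf{P}(Y_x = 1)\mathbf{P}(Y_z = 1) \le \mathbf{P}(Y_x = 1)\mathbf{P}(Y_z = 1)\left[\frac{1}{\mathbb{P}(I_{x,z})} - 1\right] + \mathbb{P}(I_{x,z}^c)$$
$$\le \mathbb{P}(I_{x,z}^c)\left[1 + 1/\mathbb{P}(I_{x,z})\right], \qquad (2.13)$$

where in the last inequality we replaced the two probabilities by 1. Now from Lemma 2.1, $\mathbb{P}(I_{x,z}^c) \le 5n^{-3/5}$, and so

$$\mathbf{P}(Y_x = 1, Y_z = 1) - \mathbf{P}(Y_x = 1)\mathbf{P}(Y_z = 1) \le 5n^{-3/5}\left(1 + 1/(1 - 5n^{-3/5})\right) \le 15n^{-3/5}$$

for large enough $n$. Using this bound,

$$\mathrm{var}\Big(\sum_{x=1}^n Y_x\Big) \le n + 15n(n - 1)n^{-3/5},$$

and Chebyshev's inequality shows that as $n \to \infty$

$$\mathbf{P}\Big(\Big|\sum_{x=1}^n (Y_x - EY_x)\Big| \ge n\delta\Big) \le \frac{n + 15n(n - 1)n^{-3/5}}{n^2\delta^2} \to 0.$$

Since $EY_x \ge \rho - \delta$, this implies

$$\lim_{n \to \infty} \mathbf{P}\Big(\sum_{x=1}^n Y_x \ge n(\rho - 2\delta)\Big) = 1. \qquad (2.14)$$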



Our next goal is to show that $\xi_T^1$ contains the random set $D = \{x : Y_x = 1\}$ at $T = T_1 + T_2$, a time that grows exponentially fast in $n$. We choose $\eta > 0$ so that $(q - \eta)(r - 1 - \eta) > 1$. Let $\epsilon_1$ and $c(\eta)$ be the constants in Lemma 2.4. If $Y_x = 1$, then $|\hat\xi^{\{x\}}_{T_1}| \ge \lceil n^b \rceil$ for $T_1 = \lceil 2\gamma \log n \rceil$. Combining the error probabilities of Lemmas 2.3 and 2.4 shows that for $T_2 = \lfloor \exp(c(\eta)n) \rfloor + \lceil \epsilon_1 n - n^b \rceil$, and for any subset $A$ of vertices with $|A| \ge \lceil n^b \rceil$

$$\mathbf{P}\left(|\hat\xi^A_{T_2}| \ge \lceil \epsilon_1 n \rceil\right) \ge 1 - 3\exp(-n^{b/4}) \qquad (2.15)$$

for large $n$.

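Let $\mathcal{C}$ be the set of all subsets of $V_n$ of size at least $\lceil n^b \rceil$, and denote $C_x = \hat\xi^{\{x\}}_{T_1}$. Using the duality relationship of (1.3) for the conditional probability distribution

$$\tilde{\mathbf{P}}(\cdot) = \mathbf{P}\left(\cdot \mid \hat\xi_t^{\{x\}}, 0 \le t \le T_1, x \in V_n\right),$$

we see that

$$\tilde{\mathbf{P}}\left(\xi^1_{T_1 + T_2} \supseteq D\right) = \tilde{\mathbf{P}}\Big[\bigcap_{x \in D} \left(x \in \xi^1_{T_1 + T_2}\right)\Big] = \tilde{\mathbf{P}}\Big[\bigcap_{x \in D} \left(\hat\xi^{\{x\}}_{T_1 + T_2} \ne \emptyset\right)\Big].$$

Since $D = \{x : Y_x = 1\}$, it follows from the definition of $Y_x$ that $C_x \in \mathcal{C}$ for all $x \in D$. So by the Markov property of the dual process the above is

$$= \sum_{C_x \in \mathcal{C}, x \in D} \tilde{\mathbf{P}}\Big[\bigcap_{x \in D} \left(\hat\xi^{\{x\}}_{T_1 + T_2} \ne \emptyset\right) \,\Big|\, \hat\xi^{\{x\}}_{T_1} = C_x, x \in D\Big]\, \tilde{\mathbf{P}}\left(\hat\xi^{\{x\}}_{T_1} = C_x, x \in D\right)$$
$$= \sum_{C_x \in \mathcal{C}, x \in D} \mathbf{P}\Big[\bigcap_{x \in D} \left(\hat\xi^{C_x}_{T_2} \ne \emptyset\right)\Big]\, \tilde{\mathbf{P}}\left(\hat\xi^{\{x\}}_{T_1} = C_x, x \in D\right).$$

Using (2.15), $\mathbf{P}(\hat\xi^{C_x}_{T_2} \ne \emptyset) \ge \mathbf{P}(|\hat\xi^{C_x}_{T_2}| \ge \lceil \epsilon_1 n \rceil) \ge 1 - 3\exp(-n^{b/4})$. So the above is

$$\ge \left(1 - 3|D|\exp(-n^{b/4})\right) \sum_{C_x \in \mathcal{C}, x \in D} \tilde{\mathbf{P}}\left(\hat\xi^{\{x\}}_{T_1} = C_x, x \in D\right)$$
$$\ge 1 - 3n\exp(-n^{b/4}).$$

For the last inequality we use $|D| \le n$ and $\tilde{\mathbf{P}}(Y_x = 1\ \forall x \in D) = 1$. Since the lower bound only depends on $n$, the unconditional probability

$$\mathbf{P}\left(\xi^1_{T_1 + T_2} \supseteq \{x : Y_x = 1\}\right) \ge 1 - 3n\exp(-n^{b/4}).$$

Hence for $T = T_1 + T_2$, using the attractiveness property of the threshold contact process, and combining the last calculation with (2.14) we conclude that as $n \to \infty$

$$\inf_{t \le T} \mathbf{P}\left(\frac{|\xi_t^1|}{n} \ge \rho - 2\delta\right) \ge \mathbf{P}\Big(\xi_T^1 \supseteq \{x : Y_x = 1\},\ \sum_{x=1}^n Y_x \ge n(\rho - 2\delta)\Big) \to 1.$$

This completes the proof of Theorem 1. □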
3 Proof of Theorem 3

Recall the definition of the active sets $A_x^k$, $k = 0, 1, \ldots, \beta_x$, and the removed sets $R_x^k$, $k = 0, 1, \ldots, \beta_x$, introduced before Lemma 2.1. Also recall the stopping times $\pi_x^1$ and $\alpha_x^{n,\delta}$ in (2.3) and define

$$\pi_x^2 = \min\{l > \pi_x^1 : |R_x^l| < l - 1\}.$$

This is the time of second collision while exploring $\hat G_n$ starting from $x$. First we show that with high probability for every vertex $x \in V_n$ the second collision occurs after $\lceil n^{1/4 - \delta} \rceil$ many steps for any $\delta \in (0, 1/4)$.

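Lemma 3.1. Let $\delta \in (0, 1/4)$ and $I_x$ be the event that at most one collision occurs during the first $\lceil n^{1/4 - \delta} \rceil$ steps of the exploration of $\hat G_n$ starting from $x$. Then for $I = \bigcap_{x \in V_n} I_x$, $\mathbb{P}(I^c) \le 2n^{-4\delta}$ for large enough $n$.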

Proof. Let $\delta' = (1/4) - \delta$. Since in the construction of the random graph $G_n$ the input nodes $y_i(z)$, $1 \le i \le r$, for any vertex $z$ are distinct and different from $z$, there are at least $n - r$ choices for each $y_i(z)$. Also $|R_x^l| \le l$ for any $l$. So $\mathbb{P}(|R_x^k| = |R_x^{k-1}|) \le (k - 1)/(n - r)$. Now if $I_x$ fails to occur, then there will be $k_1$ and $k_2$ such that $1 \le k_1 < k_2 \le \lceil n^{\delta'} \rceil$ and $|R_x^{k_i}| = |R_x^{k_i - 1}|$ for $i = 1, 2$. So

$$\mathbb{P}[(I_x)^c] \le \sum_{1 \le k_1 < k_2 \le \lceil n^{\delta'} \rceil} \mathbb{P}\left(|R_x^{k_1}| = |R_x^{k_1 - 1}|,\ |R_x^{k_2}| = |R_x^{k_2 - 1}|\right)$$
$$\le \sum_{1 \le k_1 < k_2 \le \lceil n^{\delta'} \rceil} \frac{(k_1 - 1)(k_2 - 1)}{(n - r)^2} \le \sum_{1 \le k_1, k_2 \le \lceil n^{\delta'} \rceil} \frac{(k_1 - 1)(k_2 - 1)}{(n - r)^2} \le 2n^{4\delta' - 2}$$

for large enough $n$. The second inequality holds because the choices of the input nodes are independent. Hence $\mathbb{P}(I^c) \le \sum_{x \in V_n} \mathbb{P}[(I_x)^c] \le 2n^{4\delta' - 1} = 2n^{-4\delta}$. □

Lemma 3.1 shows that with high probability for all vertices there will be at most one collision until we have explored $\lceil n^{1/4 - \delta} \rceil$ many vertices starting from any vertex of $\hat G_n$. Now recall the definition of the distance functions $d_0$ and $d$ from (1.5), and $m(A, K)$ given in (1.6). Let $R = \log n/\log r$, $a = (1/8 - \delta)$ and let $\rho$ be the branching process survival probability defined in (1.1).

Lemma 3.2. Let $P_I$ denote the conditional distribution of $\hat\xi^{\{x\}}$, $x \in V_n$, given $I$, where $I$ is the event defined in Lemma 3.1. If $qr > 1$ and $\delta_0$ is small enough, then for any $0 < \delta < \delta_0$ there are constants $C(\delta) > 0$, $B(\delta) = (1/8 - 2\delta)\log(qr - \delta)/\log r$ and a stopping time $T$ satisfying

$$P_I\left(T < 2\exp\left(C(\delta)n^{B(\delta)}\right)\right) \le 2\exp\left[-C(\delta)n^{B(\delta)}\right],$$

such that for any $A$ with $m(A, 2\lceil aR \rceil) \ge \lfloor n^{B(\delta)} \rfloor$, $|\hat\xi_T^A| \ge \lfloor n^{B(\delta)} \rfloor$.
Proof. Let $m_t = m(\hat\xi_t^A, 2\lceil aR \rceil)$. We define the stopping times $\sigma_i$ and $\tau_i$ as follows. $\sigma_0 = 0$, and for $i \ge 0$

$$\tau_{i+1} = \min\{t > \sigma_i : m_t < \lfloor n^B \rfloor\},$$
$$\sigma_{i+1} = \min\{t > \tau_{i+1} : m_t \ge \lfloor n^B \rfloor\}.$$

Since $\tau_i > \sigma_{i-1}$ for $i \ge 1$, $m_{\tau_i - 1} \ge \lfloor n^B \rfloor$, and hence there is a set $X_i \subset \hat\xi^A_{\tau_i - 1}$ of size at least $\lfloor n^B \rfloor$ such that $d(u, v) \ge 2\lceil aR \rceil$ for any two distinct vertices $u, v \in X_i$. Let $E_i$ be the event that at least $(q - \delta)|X_i|$ many vertices of $X_i$ give birth at time $\tau_i$. Using the binomial large deviation estimate (2.10)

$$P_G(E_i) \ge 1 - \exp\left(-\Gamma((q - \delta)/q)\,q\,\lfloor n^B \rfloor\right), \qquad (3.1)$$

where $\Gamma(x) = x\log x - x + 1$.

Now let $I$ be the event defined in Lemma 3.1. Since $|\{z : d_0(x, z) \le 2\lceil aR \rceil\}|$ is at most $2r^{2\lceil aR \rceil} \le 2r^2 n^{2a} \le n^{1/4 - \delta}$, if $I$ occurs, then for any vertex $x \in V_n$ there is at most one collision in $\{z : d_0(x, z) \le 2\lceil aR \rceil\}$, and hence there are at least $r - 1$ input nodes $u_1(x), \ldots, u_{r-1}(x)$ of $x$ such that $\{z : d_0(u_i(x), z) \le 2\lceil aR \rceil - 1\}$ is a finite oriented $r$-tree for each $1 \le i \le r - 1$. Since the right side of (3.1) depends only on $n$,

$$P_I(I \cap E_i) = P_I(E_i) \ge 1 - \exp\left(-c_1(\delta)n^B\right),$$

where $c_1(\delta) = \Gamma((q - \delta)/q)q/2$. If $I \cap E_i$ occurs, then we can choose one suitable offspring of each of the vertices in $X_i$ which give birth, to form a subset $N_i \subset \hat\xi^A_{\tau_i}$ such that $|N_i| \ge (q - \delta)\lfloor n^B \rfloor$, $d(u, v) \ge 2\lceil aR \rceil - 2$ for any two distinct vertices $u, v \in N_i$, and $\{z : d_0(u, z) \le 2\lceil aR \rceil - 1\}$ is a finite oriented $r$-tree for each $u \in N_i$.

By the definition of $N_i$ it is easy to see that for each $x \in N_i$

\[ P_I\bigl[(|\xi_t^{\{x\}}|,\ 0 \le t \le 2\lceil aR\rceil - 1) \in \cdot\,\bigr] = P_{Z^x}\bigl[(Z_t^x,\ 0 \le t \le 2\lceil aR\rceil - 1) \in \cdot\,\bigr], \]

where $Z^x$ is a supercritical branching process, as introduced in Lemma 2.2, with distribution $P_{Z^x}$ and mean offspring number $qr$. Let $B_x$ be the event of survival for $Z^x$, and $F_x = \bigcap_{t=\lfloor\delta R\rfloor-1}^{\lceil aR\rceil-2} \{Z^x_{t+1} \ge (qr - \delta) Z^x_t\}$. So $P_{Z^x}(B_x) = \rho > 0$ as in (1.1). Using the error probability

of (E2D 

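\[ P_{Z^x}(F_x^c \mid B_x) \le \sum_{t=\lfloor\delta R\rfloor-1}^{\lceil aR\rceil-2} e^{-c'(\delta)t} \le C_\delta\, e^{-c'(\delta)\delta\log n/(2\log r)} = C_\delta\, n^{-c'(\delta)\delta/(2\log r)} \tag{3.2} \]

for some constants $C_\delta, c'(\delta) > 0$. On the event $B_x \cap F_x$,

\[ Z^x_{\lceil aR\rceil-1} \ge (qr - \delta)^{(\lceil aR\rceil-1)-(\lfloor\delta R\rfloor-1)} \ge (qr - \delta)^{(a-\delta)R} = n^{(a-\delta)\log(qr-\delta)/\log r} = n^B. \]

Hence for $Q_x = \{|\xi^{\{x\}}_{\lceil aR\rceil-1}| \ge \lfloor n^B\rfloor\}$, $x \in N_i$, we use standard probability arguments and (3.2) to have

\begin{align*}
P_I(Q_x) &= P_I\bigl(|\xi^{\{x\}}_{\lceil aR\rceil-1}| \ge \lfloor n^B\rfloor\bigr) = P_{Z^x}\bigl(Z^x_{\lceil aR\rceil-1} \ge \lfloor n^B\rfloor\bigr) \\
&\ge P_{Z^x}(B_x \cap F_x) = P_{Z^x}(B_x)\,P_{Z^x}(F_x \mid B_x) \ge \rho - \delta \tag{3.3}
\end{align*}

for large enough $n$.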
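
The growth estimate on $B_x \cap F_x$ used above comes down to elementary exponent arithmetic. The sketch below assumes $R = \log n/\log r$ (so that $r^R = n$), which is how $R$ enters the exponents in (3.2), and $qr - \delta > 1$; both are read off from the surrounding text rather than from a restated definition.

```latex
\begin{align*}
(\lceil aR\rceil - 1) - (\lfloor \delta R\rfloor - 1)
  &= \lceil aR\rceil - \lfloor \delta R\rfloor \;\ge\; (a-\delta) R, \\
(qr-\delta)^{(a-\delta)R}
  &= \exp\Bigl( (a-\delta)\,\frac{\log n}{\log r}\,\log(qr-\delta) \Bigr)
   = n^{(a-\delta)\log(qr-\delta)/\log r}, \\
a - \delta = \tfrac18 - 2\delta
  \quad&\Longrightarrow\quad
  (a-\delta)\,\frac{\log(qr-\delta)}{\log r} = B .
\end{align*}
```

On the survival event the population at time $\lfloor\delta R\rfloor - 1$ is at least 1, so multiplying the one-step growth factors $(qr - \delta)$ guaranteed by $F_x$ gives the stated lower bound.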

Since $d(u,v) \ge 2\lceil aR\rceil - 2$ for any two distinct vertices $u, v \in N_i$, $\xi_t^{N_i}$ is a disjoint union of $\xi_t^{\{x\}}$ over $x \in N_i$ for $t \le \lceil aR\rceil - 1$. Let $H_i$ be the event that there is at least one $x \in N_i$ for which $Q_x$ occurs. Then recalling that $|N_i| \ge (q - \delta)\lfloor n^B\rfloor$ on $E_i$,

\[ P_I(H_i^c \mid E_i) \le (1 - \rho + \delta)^{(q-\delta)\lfloor n^B\rfloor} \le \exp\bigl(-c_2(\delta) n^B\bigr), \tag{3.4} \]

where $c_2(\delta) = (q - \delta)\log(1/(1 - \rho + \delta))/2$.

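If $H_i \cap E_i$ occurs, choose any vertex $w_i \in N_i$ such that $Q_{w_i}$ occurs and let $S_i = \xi^{\{w_i\}}_{\lceil aR\rceil-1}$. By the choice of $w_i$, $|S_i| \ge \lfloor n^B\rfloor$. Since $(\lceil aR\rceil - 1) + \lceil aR\rceil = 2\lceil aR\rceil - 1$, for any two distinct vertices $x, z \in S_i$ the subgraphs induced by $\{u : d_0(x,u) \le \lceil aR\rceil\}$ and $\{u : d_0(z,u) \le \lceil aR\rceil\}$ are finite $r$-trees consisting of disjoint sets of vertices, and hence $d(x,z) \ge 2\lceil aR\rceil$. Hence, using monotonicity of the dual process, $\sigma_i \le \tau_i + \lceil aR\rceil - 1$ on the event $H_i \cap E_i$. So

\[ P_I(\sigma_i > \tau_i + \lceil aR\rceil - 1) \le P_I(E_i^c) + P_I(H_i^c \mid E_i) \le 2\exp\bigl(-2C(\delta) n^B\bigr), \]

where $C(\delta) = \min\{c_1(\delta), c_2(\delta)\}/2$. Let $L = \inf\{i \ge 1 : \sigma_i > \tau_i + \lceil aR\rceil - 1\}$. Then

\[ P_I\bigl[L > \exp(C(\delta) n^B)\bigr] \ge \bigl[1 - 2\exp(-2C(\delta) n^B)\bigr]^{\exp(C(\delta) n^B)} \ge 1 - 2\exp\bigl(-C(\delta) n^B\bigr). \]

Since $\sigma_i > \tau_i > \sigma_{i-1}$, $\sigma_{L-1} \ge 2(L - 1)$. As $|\xi^A_{\sigma_{L-1}}| \ge \lfloor n^B\rfloor$, we get our result if we take $T = \sigma_{L-1}$.

□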
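
The final estimate for $L$ in the proof above is an instance of Bernoulli's inequality. In the sketch below, $p_n$ and $N_n$ are shorthand introduced here (not notation from the paper) for the per-cycle failure bound and the number of cycles, and $n$ is assumed large enough that $p_n \le 1$.

```latex
\begin{align*}
p_n &= 2\exp\bigl(-2C(\delta)\, n^B\bigr), \qquad
N_n = \exp\bigl(C(\delta)\, n^B\bigr) \ge 1, \\
(1 - p_n)^{N_n} &\ge 1 - N_n\, p_n
  = 1 - 2\exp\bigl(-C(\delta)\, n^B\bigr).
\end{align*}
```

The failure probability per cycle is squared-exponentially small relative to the number of cycles, which is why the product $N_n p_n$ still tends to zero.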



As in the proof of Theorem[H survival of the dual process gives persistence of the threshold 
contact process. 

Proof of Theorem 3. Let $0 < \delta < \delta_0$, $\rho$, $a = 1/8 - \delta$ and $B = (1/8 - 2\delta)\log(qr - \delta)/\log r$ be the constants from the previous proof. Define the random variables $Y_x$, $1 \le x \le n$, as $Y_x = 1$ if the dual process $\xi^{\{x\}}$ starting at $x$ satisfies $|\xi^{\{x\}}_{\lceil aR\rceil-1}| \ge \lfloor n^B\rfloor$, and $Y_x = 0$ otherwise.

Consider the event $I_x = \{\pi_x \wedge \beta_x \ge \alpha_x^{1/4+\delta}\}$, where $\pi_x$, $\beta_x$ and $\alpha_x^{1/4+\delta}$ are stopping times defined as in (2.3). Using Lemmas 2.1 and 3.1,

\[ P_I(I_x^c) \le \frac{P(I_x^c)}{P(I)} \le \frac{n^{-2(1/4+\delta)}}{1 - 2n^{-4\delta}} \le 2n^{-(1/2+2\delta)}. \tag{3.5} \]

Let $J_x = I \cap I_x$ and let $P_{J_x}$ be the conditional distribution of $\xi$ given $J_x$. Since the number of vertices in the set $\{u : d_0(x,u) \le \lceil aR\rceil - 1\}$ is at most $2r^{\lceil aR\rceil-1} \le 2r^{aR} \le n^{1/4-\delta}$ by the choice of $a$,

\[ P_{J_x}\bigl[(|\xi_t^{\{x\}}|,\ 0 \le t \le \lceil aR\rceil - 1) \in \cdot\,\bigr] = P_{Z^x}\bigl[(Z_t^x,\ 0 \le t \le \lceil aR\rceil - 1) \in \cdot\,\bigr], \]



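where $Z^x$ is a supercritical branching process, as introduced in Lemma 2.2, with distribution $P_{Z^x}$ and mean offspring number $qr$. Let $B_x$ be the event of survival for $Z^x$ and $F_x = \bigcap_{t=\lfloor\delta R\rfloor-2}^{\lceil aR\rceil-2} \{Z^x_{t+1} \ge (qr - \delta) Z^x_t\}$. So $P_{Z^x}(B_x) = \rho > 0$ as in (1.1), and similar to (3.2)

\[ P_{Z^x}(F_x^c \mid B_x) \le \sum_{t=\lfloor\delta R\rfloor-2}^{\lceil aR\rceil-2} e^{-c'(\delta)t} \le C_\delta\, n^{-c'(\delta)\delta/(2\log r)} \]

for some constants $C_\delta, c'(\delta) > 0$. On the event $B_x \cap F_x$, $Z^x_{\lceil aR\rceil-1} \ge (qr - \delta)^{(\lceil aR\rceil-1)-(\lfloor\delta R\rfloor-2)} \ge (qr - \delta)^{(a-\delta)R} \ge \lfloor n^B\rfloor$. Hence using (3.5)

\begin{align*}
P_I(Y_x = 1) &\ge P_I\bigl(I_x \cap \{|\xi^{\{x\}}_{\lceil aR\rceil-1}| \ge \lfloor n^B\rfloor\}\bigr) = P_{J_x}\bigl(|\xi^{\{x\}}_{\lceil aR\rceil-1}| \ge \lfloor n^B\rfloor\bigr)\,P_I(I_x) \\
&= P_{Z^x}\bigl(Z^x_{\lceil aR\rceil-1} \ge \lfloor n^B\rfloor\bigr)\,P_I(I_x) \\
&\ge P_{Z^x}(B_x \cap F_x)\,P_I(I_x) = P_{Z^x}(B_x)\,P_{Z^x}(F_x \mid B_x)\,P_I(I_x) \ge \rho - \delta
\end{align*}

for large enough $n$.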

Next we estimate the covariance between the events $\{Y_x = 1\}$ and $\{Y_z = 1\}$. We consider the stopping times $\pi_x$, $\beta_x$, $\pi_{x,z}$, $\alpha_x^{1/4+\delta}$ as in (2.3) and the corresponding event $I_{x,z}$ as in Lemma 2.1. An argument similar to the one leading to (2.13) gives

\[ P_I(Y_x = 1, Y_z = 1) - P_I(Y_x = 1)\,P_I(Y_z = 1) \le P_I(I_{x,z}^c)\bigl(1 + 1/P_I(I_{x,z})\bigr). \]

From Lemmas 2.1 and 3.1,

\[ P_I(I_{x,z}^c) \le \frac{P(I_{x,z}^c)}{P(I)} \le \frac{5n^{-2(1/4+\delta)}}{1 - 2n^{-4\delta}} \]

for large enough $n$, and so

\[ P_I(Y_x = 1, Y_z = 1) - P_I(Y_x = 1)\,P_I(Y_z = 1) \le 30n^{-(1/2+2\delta)} \]
for large n. Using the bound on the covariances, 



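\[ \operatorname{var}_I\Bigl(\sum_{x=1}^n Y_x\Bigr) \le n + 30\,n(n - 1)\,n^{-(1/2+2\delta)}, \]

and Chebyshev's inequality gives that as $n \to \infty$

\[ P_I\Bigl(\Bigl|\sum_{x=1}^n (Y_x - E_I Y_x)\Bigr| \ge n\delta\Bigr) \le \frac{n + 30\,n(n-1)\,n^{-(1/2+2\delta)}}{n^2\delta^2} \to 0. \]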
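
To see why the Chebyshev bound vanishes, one can split the numerator term by term; the only inputs are the variance bound above and the fact that $\delta > 0$ is fixed.

```latex
\begin{align*}
\frac{n + 30\,n(n-1)\,n^{-(1/2+2\delta)}}{n^2\delta^2}
 \;\le\; \frac{1}{n\,\delta^2} + \frac{30\, n^{-(1/2+2\delta)}}{\delta^2}
 \;\longrightarrow\; 0 \qquad (n \to \infty).
\end{align*}
```

The first term comes from the $n$ diagonal variances (each at most 1), the second from the at most $n(n-1)$ covariance terms.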



Since $E_I Y_x \ge \rho - \delta$ for all $x \in V_n$, this implies

\[ \lim_{n\to\infty} P_I\Bigl(\sum_{x=1}^n Y_x \ge n(\rho - 2\delta)\Bigr) = 1. \tag{3.6} \]



Our next goal is to show that $\zeta_T^1$ contains the random set $D = \{x : Y_x = 1\}$ with high probability for a suitable choice of $T$. If $Y_x = 1$, then $|\xi^{\{x\}}_{T_1}| \ge \lfloor n^B\rfloor$, where $T_1 = \lceil aR\rceil - 1$. Note that $\lceil aR\rceil - 1 + \lceil aR\rceil < 2\lceil aR\rceil$, and on the event $I$ there can be at most one collision in $\{u : d_0(x,u) \le 2\lceil aR\rceil\}$. Even if that collision occurs between descendants of two vertices in $\xi^{\{x\}}_{T_1}$, we can still exclude one vertex from $\xi^{\{x\}}_{T_1}$ to obtain a set $W_x \subset \xi^{\{x\}}_{T_1}$ of size at least $\lfloor n^B\rfloor$ such that for any two distinct vertices $z, w \in W_x$, the subgraphs induced by $\{u : d_0(z,u) \le \lceil aR\rceil\}$ and $\{v : d_0(w,v) \le \lceil aR\rceil\}$ are finite oriented $r$-trees consisting of disjoint sets of vertices, i.e. $d(z,w) \ge 2\lceil aR\rceil$. So if $Y_x = 1$, then $m(\xi^{\{x\}}_{T_1}, 2\lceil aR\rceil) \ge \lfloor n^B\rfloor$ on the event $I$. Using Lemma 3.2, after an additional $T_2 \ge 2\exp(C(\delta)n^B)$ units of time, the dual process contains at least $\lfloor n^B\rfloor$ many occupied sites with $P_I$ probability $\ge 1 - 2\exp(-C(\delta)n^B)$.

Let $\mathcal{T}$ be the set of all subsets of $V_n$ of size $\ge \lfloor n^B\rfloor$, and denote $F_x = \xi^{\{x\}}_{T_1}$. Using the duality relationship of (1.3) for the conditional probability $P_I(\cdot) = P(\cdot \mid I)$, where



ef } ,0<t<T 1; xGK 



we see that 



Vi(e Tl+T ^D) = V I [n xeD (xee Tl+ T 2 )} 

Since $D = \{x : Y_x = 1\}$, $F_x \in \mathcal{T}$ for all $x \in D$. So by the Markov property of the dual process the above is



E * 

E P- 

F x £T,x&D 



{x} 



Pi 



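Now since $W_x \subset F_x$, using monotonicity of the dual process, $P_I(\xi^{F_x}_{T_2} \ne \emptyset) \ge P_I(\xi^{W_x}_{T_2} \ne \emptyset)$. Also, using Lemma 3.2, $P_I(|\xi^{W_x}_{T_2}| \ge \lfloor n^B\rfloor) \ge 1 - 2\exp(-C(\delta)n^B)$ for any $F_x \in \mathcal{T}$. So the above is

\begin{align*}
&\ge \bigl(1 - 2|D|\exp(-C(\delta)n^B)\bigr) \sum_{F_x \in \mathcal{T},\, x \in D} P_I\Bigl[\bigcap_{x \in D} \{\xi^{\{x\}}_{T_1} = F_x\}\Bigr] \\
&\ge 1 - 2n\exp\bigl(-C(\delta)n^B\bigr).
\end{align*}

For the last inequality we use $|D| \le n$ and $P_I(Y_x = 1\ \forall x \in D) = 1$. Since the lower bound only depends on $n$,

\begin{align*}
& P_I\bigl(\zeta^1_{T_1+T_2} \supseteq \{x : Y_x = 1\}\bigr) \ge 1 - 2n\exp\bigl(-C(\delta)n^B\bigr) \\
\Rightarrow\ & P\bigl(\zeta^1_{T_1+T_2} \supseteq \{x : Y_x = 1\}\bigr) \ge P(I)\bigl[1 - 3n\exp\bigl(-C(\delta)n^B\bigr)\bigr] \to 1,
\end{align*}

as $n \to \infty$, since $P(I) \ge 1 - 2n^{-4\delta}$ by Lemma 3.1.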

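Hence for $T = T_1 + T_2$, using the attractiveness property of the threshold contact process and combining the last calculation with (3.6), we conclude that as $n \to \infty$

\begin{align*}
\inf_{t \le T} P\Bigl(\frac{|\zeta_t^1|}{n} \ge \rho - 2\delta\Bigr) &= P\Bigl(\frac{|\zeta_T^1|}{n} \ge \rho - 2\delta\Bigr) \\
&\ge P\Bigl(\zeta_T^1 \supseteq \{x : Y_x = 1\},\ \sum_{x=1}^n Y_x \ge n(\rho - 2\delta)\Bigr) \to 1,
\end{align*}

which completes the proof of Theorem 3. □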

References 

Albert, R. and Othmer, H. G. (2003) The topology of the regulatory interactions predicts
the expression pattern of the segment polarity genes in Drosophila melanogaster. Journal of
Theoretical Biology, 223, 1-18

Athreya, K.B. (1994) Large deviations for branching processes, I. Single type case. Ann. 
Appl. Prob. 4, 779-790 

Chaves, M., Albert, R., and Sontag, E.D. (2005) Robustness and fragility of Boolean models 
for genetic regulatory networks. J. Theor. Biol. 235, 431-449 

Derrida, B. and Pomeau, Y. (1986) Random networks of automata: a simplified annealed 
approximation. Europhysics Letters, 1, 45-49 

Durrett, R. (2007) Random Graph Dynamics. Cambridge University Press. 

Flyvbjerg, H. and Kjaer, N. J. (1988) Exact solution of Kauffman's model with connectivity
one. Journal of Physics A, 21, 1695-1718

Griffeath, D. (1978) Additive and cancellative interacting particle systems. Lecture Notes in
Mathematics, 724. Springer, Berlin.






Kadanoff, L.P., Coppersmith, S., and Aldana, M. (2002) Boolean dynamics with random
couplings. arXiv:nlin.AO/0204062

Kauffman, S. A. (1969) Metabolic stability and epigenesis in randomly constructed genetic 
nets. Journal of Theoretical Biology, 22, 437-467 

Kauffman, S. A. (1993) Origins of Order: Self- Organization and Selection in Evolution. 
Oxford University Press. 

Kauffman, S.A., Peterson, C., Samuelsson, B., and Troein, C. (2003) Random Boolean models
and the yeast transcriptional network. Proceedings of the National Academy of Sciences, 100,
14796-14799

Li, F., Long, T., Lu Y., Ouyang Q., Tang C. (2004) The yeast cell-cycle is robustly designed. 
Proceedings of the National Academy of Sciences, 101, 4781-4786. 

Liggett, T. M. (1999) Stochastic Interacting Systems: Contact, Voter, and Exclusion Processes.
Springer.

Nykter, M., Price, N.D., Aldana, M., Ramsey, S.A., Kauffman, S.A., Hood, L.E., Yli-Harja,
O., and Shmulevich, I. (2008) Proceedings of the National Academy of Sciences, 105, 1897-
1900

Shmulevich, I., Kauffman, S.A., and Aldana, M. (2005) Eukaryotic cells are dynamically
ordered or critical but not chaotic. Proceedings of the National Academy of Sciences, 102,
13439-13444






